A benchmark of Spanish language datasets for computationally driven research

Abstract

In the domain of Galleries, Libraries, Archives and Museums (GLAM) institutions, creative and innovative tools and methodologies for content delivery and user engagement have recently gained international attention. New methods have been proposed to publish digital collections as datasets amenable to computational use. Standardised benchmarks can be useful to broaden the scope of machine-actionable collections and to promote cultural and linguistic diversity. In this article, we propose a methodology to select datasets for computationally driven research applied to Spanish text corpora. This work seeks to encourage Latin American institutions to publish machine-actionable collections based on best practices and avoiding common mistakes.

Similar resources

A Survey of Current Datasets for Vision and Language Research

Integrating vision and language has long been a dream in work on artificial intelligence (AI). In the past two years, we have witnessed an explosion of work that brings together vision and language from images to videos and beyond. The available corpora have played a crucial role in advancing this area of research. In this paper, we propose a set of quality metrics for evaluating and analyzing ...

Hybrid Heuristic Optimization for Benchmark Datasets

This paper introduces a hybridization of particle swarm optimization (PSO) with a genetic algorithm (GA), denoted PSO+GA, which provides an efficient approach for solving nonlinear chaotic datasets. The proposed algorithm is employed in a probabilistic neural network (PNN), a variant of the radial basis function artificial neural network (RBFANN), for finding a precise value of the spread factor for acc...

TweetNorm: a benchmark for lexical normalization of Spanish tweets

The language used in social media is often characterized by the abundance of informal and non-standard writing. The normalization of this non-standard language can be crucial to facilitate the subsequent textual processing and to consequently help boost the performance of natural language processing tools applied to social media text. In this paper we present a benchmark for lexical normalizati...

Language Resources for Spanish - Spanish Sign Language (LSE) translation

This paper describes the development of a Spanish-Spanish Sign Language (LSE) translation system. Firstly, it describes the first Spanish-Spanish Sign Language (LSE) parallel corpus focused on two specific domains: the renewal of the Identity Document and Driver’s License. This corpus includes more than 4,000 Spanish sentences (in these domains), their LSE translation and a video for each LSE s...

Developing a pattern based on speech acts and language functions for developing materials for the course “The Study of Islamic Texts Translation”

The aim of the present study is to propose a model based on speech acts and language functions for developing materials for the course "The Study of Islamic Texts Translation". To produce better and more engaging materials than the existing textbooks, the new model draws on Austin's (1962) levels of speech acts, Searle's (1976) classification of speech acts, and Halliday's (1978) language functions. To this end, 57 verses were randomly selected from different parts of the Quran...

Journal

Journal title: Journal of Information Science

Year: 2021

ISSN: 0165-5515, 1741-6485

DOI: https://doi.org/10.1177/01655515211060530